The 15th Asilomar Conference on Mass Spectrometry
this October was devoted to the role of mass spectrom- 
etry (MS) in proteomics. The Asilomar Conference site 
is in a picturesque national park in Pacific Grove, CA, 
overlooking the Pacific Ocean. The conference aims to 
bring together scientists from a cross section of disci- 
plines that are applying MS to an emerging field. This 
year, that emerging field is proteomics. The term 
"proteome" was coined by Wilkins et al. (17) in the 
mid-1990s to describe the protein complement of the 
genome. The term was first used to describe the 20-yr- 
old field of two-dimensional gel electrophoresis (2-DE) 
and quantitative image analysis. 2-DE remains the 
highest resolution protein separation method avail- 
able, but the ability to identify the observed proteins 
was always an extremely difficult problem. MS has 
been integral to solving that problem. Although improve- 
ments in 2-D gel technology had been realized since its 
introduction, three enabling technological advances 
have provided the basis for the foundation of the field of 
proteomics. The first advance was the introduction of 
large-scale nucleotide sequencing of both expressed 
sequence tags (ESTs) and, more recently, genomic 
DNA. The second was the development of mass spec- 
trometers able to ionize and mass-analyze biological 
molecules and, more recently, the widespread introduction
of mass spectrometers capable of data-dependent
ion selection for fragmentation (MS/MS) (i.e., without 
the need for user intervention). The third was the 
development of computer algorithms able to match 
uninterpreted (or partially interpreted) MS/MS spectra 
with translations of the nucleotide sequence databases, 
thereby tying the first two technological advances 
together. Thus MS played a key role in the passage of 
2-DE/image analysis to proteomics. 

As a note to readers unfamiliar with MS, the instru- 
ments are named for their type of ionization source and 
mass analyzer (see also Refs. 1, 11, 12). To measure the 
mass of molecules, the test material must be charged 
(hence ionized) and desolvated (dry). The two most 
successful mechanisms for ionization of peptides and 
proteins are matrix-assisted laser desorption ionization
(MALDI) and electrospray ionization (ESI). In
MALDI the analyte of interest is embedded in a matrix 
that is dried and then volatilized in a vacuum under 
ultraviolet laser irradiation. This is a relatively efficient
process that ablates only a small portion of the
analyte with each laser shot. Typically, the mass ana- 
lyzer coupled with MALDI is a time-of-flight (TOF) 
mass analyzer that simply measures the elapsed time 
from acceleration of the charged (ionized) molecules 
through a field-free drift region. The other common 
ionization source is ESI, in which the analyte is sprayed 
from a fine needle at high voltage toward the inlet of the 
mass spectrometer (which is under vacuum) at a lower 
voltage. The spray is typically either from a reversed- 
phase HPLC (RP-HPLC) column or a nanospray device 
(19) that is similar to a microinjection needle. During 
this process, the droplets containing analyte are dried 
and gain charge (ionize). The ions formed during this 
process are directed into the mass analyzer, which 
could be either a triple-quadrupole, an ion trap, a 
Fourier-transform ion cyclotron resonance (FT-ICR), or 
a hybrid quadrupole TOF (Qq-TOF) type. 
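
As a rough illustration of the time-of-flight principle mentioned in this paragraph: an ion of mass m and charge z accelerated through a potential V gains kinetic energy zeV, so it crosses a field-free drift region of length L in a time t = L * sqrt(m / (2zeV)), and flight time scales with the square root of m/z. The sketch below uses this idealized relation only. The 20 kV accelerating voltage, the 1 m drift length, and the function name are illustrative assumptions, not parameters of any instrument discussed at the meeting.

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19   # coulombs
DALTON = 1.66053906660e-27            # kilograms per dalton


def flight_time(mass_da, charge, voltage=20_000.0, drift_length=1.0):
    """Idealized drift time (seconds) of an ion in a linear TOF analyzer.

    The voltage (V) and drift_length (m) defaults are assumed values
    chosen only for illustration.
    """
    mass_kg = mass_da * DALTON
    kinetic_energy = charge * ELEMENTARY_CHARGE * voltage   # joules
    velocity = math.sqrt(2.0 * kinetic_energy / mass_kg)    # metres per second
    return drift_length / velocity


if __name__ == "__main__":
    for mass in (1000.0, 4000.0):
        microseconds = flight_time(mass, 1) * 1e6
        print(f"{mass:7.1f} Da, 1+ ion: {microseconds:.2f} microseconds")
```

Under these assumptions, quadrupling the mass of a singly charged ion doubles its flight time, which is why the analyzer can convert arrival times directly into m/z values.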

This Asilomar meeting provided one of the largest 
academic forums in the United States for the presenta- 
tion and discussion of MS as it is applied to proteomics. 
As is obvious from the introduction, the initial role MS 
played was as a protein identification and characteriza- 
tion methodology. However, the role of MS is expanding 
in this field. Although a series of talks focused on the 
use of different kinds of MS to identify gel-separated 
proteins and the various automation technologies ap- 
plied to perform this in high throughput, several talks 
also presented alternate approaches. These approaches 
utilized direct analysis of digested protein mixtures for 
either identification of the components or quantitative 
analysis of two different samples mixed together. Spe- 
cific biological applications were also presented. As 
described above, a critical component of any MS ap- 
proach as applied to proteomics is the computational 
analysis. This report will be divided to focus on these 
six aspects of MS in proteomics. Where references are 
known for some of the material presented, they are 
cited. The program was, however, not entirely limited 
to MS in proteomics. Prior to the six sections covering 
the conference core, the first section of this report 
covers those presentations that were aimed at provid- 
ing an insight into broader biological and drug discov- 
ery processes. 

Proteomics in biology and drug discovery. The open- 
ing lecture, given by Lee Hood (Univ. of Washington), 
provided an excellent overview of Genomics, Proteom- 
ics, and Systems Biology. Hood described the genome 
project efforts that provide four types of maps: genetic, 
physical, gene, and sequence. For the human genome,
it is anticipated that 90-95% of all genes will be
sequenced sometime next year. This is the first step 
toward what Hood described as the "Periodic Table of 
Life." The different approaches to genomic sequencing 
and microarray technologies that are able to interro- 
gate the mRNA levels of thousands of genes at a time 
were described. Hood described proteomics in broad 
terms as the study of multiplicity of proteins. The 
information obtained from the various hierarchical 
levels of biological information (gene, protein, pathways,
interconnecting pathways) must be integrated
for us to be able to provide a more complete biological 
picture. For both microarray and proteomics, samples 
representing the disease process must be obtained. 
This means that pure cell populations must be micro- 
scopically captured from tissues and/or sorted prior to 
analysis. Therefore, analyses at the mRNA and protein 
level must be conducted at very low levels and substan- 
tial engineering opportunities exist in the biological 
field to provide the necessary solutions. However, gen- 
eration of the data is only the first hurdle, as the 
analysis of the data from a systems perspective then 
must be undertaken. Hood presented systems biology 
as the challenge for the 21st century and provided a 
series of examples of large-scale approaches to biology, 
from genome sequencing of unicellular organisms, to 
the sequencing of the T-cell receptor locus, to cancer 
biology, all of which benefit from such approaches. 

Three other presentations were included in the pro- 
gram, to provide a broader background to the utiliza- 
tion of proteomics in drug discovery. Doug Buckley 
(Exelixis) described the generic view of the drug discov- 
ery pipeline, the various "choke" points in the process, 
and where proteomics could play a role. Of note was the 
discussion of the changing patent protection landscape, 
during which Buckley said that full-length cDNA pat- 
ents were being issued despite the existence of EST 
patents on portions of these genes. Buckley also pre- 
dicted that functional data is expected to be required 
for patents beyond the inferences gained from bioinfor- 
matics. The choke points he referred to were target 
validation, assay development, mechanistic biology, 
and toxicology. Exelixis is using model organisms (Cae- 
norhabditis elegans, Drosophila, mouse, and zebrafish) 
to screen for genes that disrupt/modulate pathways 
common between man and these organisms. Roles for 
proteomics included follow-up on targets (direct analy- 
sis of protein differences, proteins associated with gene 
products of interest), assay development [e.g., validation
of hits in high-throughput screening (HTS)], and
mechanistic biology (e.g., comprehensive analysis of a
knockout phenotype). Most importantly, Buckley pre- 
sented the bottom line that all new technologies must 
demonstrate their worth by concrete changes in the 
drug development pipeline (i.e., greater efficiency, bet- 
ter decisions). He predicted that proteomics could pro- 
vide these benefits at the multiple restriction points 
referred to above. 

Pharmacoproteomics, using 2-DE to profile mecha- 
nisms of drug efficacy and toxicity, was presented by 
Tina Gatlin (Biosource/Large Scale Biology Corporation).
The synergy between mRNA expression profiling
(for low-abundance gene products) and protein expres- 
sion profiling (for posttranslational modifications and 
subcellular localization) was presented. An exception to 
this is the search for surrogate markers, where secreted 
proteins are normally the choice and for which there is
no identifiable mRNA source to mirror serum or urine 
protein expression. The aim of their Molecular Effects 
Database of 2-DE patterns, obtained from livers of 
drug-treated rats, is to establish links between expres- 
sion patterns and toxic endpoints to reveal markers for 
efficacy and prediction of side effects which can be used 
for lead selection. In disease models, the hypothesis is 
that the altered expression pattern could be reversed 
by treatment with a drug. 

The closing presentation of the meeting, given by Jeff 
Seilhamer (Incyte), presented analyses of the precursor 
to proteins, mRNA. The staff at Incyte have generated 
very large EST libraries and from these have estimated 
the number of genes in the human genome to be 
129,769 (based on CpG island estimates, 142,634). 
They are now sequencing the human genome at a rate 
of about 1 million reads a month on the MegaBACE
platform with 9 sequencing runs/day. Assembly of the 
data is being accomplished using Linux on 1,500 CPUs 
(160 computers) with 75 terabytes of storage capacity. 
Single-nucleotide polymorphisms (SNPs) are being cal- 
culated from their sizable EST collection, and mRNA 
expression profiling is being achieved using their GEM 
microarray platform. These data are being integrated 
with 2-DE proteomics data being generated by their 
partner Oxford GlycoSciences. This integration of the 
technologies of genomics and proteomics forms the 
basis of their drug discovery approach for profiling 
differences between normal and diseased tissue. 

Computational aspects of proteomics. Determining 
the masses of peptides (MS spectra) derived from 
enzymatic digestion of gel-separated proteins is often 
the first step in a mass spectrometric-based protein 
identification strategy. Peptide-mass mapping is the 
most commonly employed mass spectrometry ap- 
proach for protein identification from organisms whose 
genome is completely sequenced (or at least for which 
the more abundantly expressed genes have been se- 
quenced). The basis of the method is the matching of 
experimentally determined peptide masses with pep- 
tide masses calculated for each entry in a sequence 
database (using the specificity of the enzyme used to 
generate the experimental data). How well the experi- 
mentally determined masses match with the calculated 
masses forms the basis of the approach. Ron Beavis 
(Proteometrics) described how to obtain high-quality
data, arguing that a small set of high-quality data is better than
a larger set of low-quality data. The use of specific matrices as well as the
use of standards with respect to obtaining appropriate 
data sets for peptide-mass mapping was addressed. 
Later in the day David Fenyo (Proteometrics) described 
how to utilize these data in a three-step process as is
performed in their WWW-available program, ProFound,
which uses a Bayesian algorithm (http://www.proteometrics.com).
The process is as follows:
1) assignment of monoisotopic masses to the raw data,
2) peptide-mass search, and 3) significance testing of
the result (4). The last step was presented as the most 
critical because it is from this that the confidence of the 
match is derived. This is achieved through calculation 
of a score frequency function for false positives. This 
was derived from statistical analysis of the database 
being searched using random selections of peptide 
masses from different proteins that are then grouped as 
synthetic proteins and used in a peptide-mass search of 
the database in question. This is repeated for a variety 
of random selections to come up with robust statistics 
for false positives. 
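
The peptide-mass mapping idea described in this section can be sketched in a few lines. The code below is a simplified illustration, not the Bayesian scoring used by ProFound: it digests each database sequence in silico with trypsin rules, counts observed masses that fall within a ppm tolerance of a predicted peptide, and estimates a false-positive score distribution by searching randomly drawn peptide masses. The toy sequences, the 50 ppm tolerance, the match-count score, and all function names are assumptions made for illustration.

```python
import random

# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056


def tryptic_peptides(sequence):
    """Cleave after K or R unless followed by P (no missed cleavages)."""
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        next_aa = sequence[i + 1] if i + 1 < len(sequence) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of a peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def count_matches(observed, sequence, tol_ppm=50.0):
    """Number of observed masses within tol_ppm of a predicted peptide."""
    predicted = [peptide_mass(p) for p in tryptic_peptides(sequence)]
    return sum(
        any(abs(obs - calc) / calc * 1e6 <= tol_ppm for calc in predicted)
        for obs in observed
    )


def rank_database(observed, database, tol_ppm=50.0):
    """Rank entries (name -> sequence) by number of matched masses."""
    scores = {name: count_matches(observed, seq, tol_ppm)
              for name, seq in database.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def false_positive_scores(database, n_masses, n_trials, seed=0, tol_ppm=50.0):
    """Top scores obtained when searching randomly drawn peptide masses.

    Each trial pools n_masses peptide masses drawn at random from the
    whole database into a 'synthetic protein' and records its best score.
    """
    rng = random.Random(seed)
    pool = [peptide_mass(p)
            for seq in database.values() for p in tryptic_peptides(seq)]
    scores = []
    for _ in range(n_trials):
        synthetic = rng.sample(pool, n_masses)
        scores.append(rank_database(synthetic, database, tol_ppm)[0][1])
    return scores


if __name__ == "__main__":
    toy_db = {"protA": "GASPKVTLR", "protB": "MNDQKEHFR"}  # hypothetical entries
    observed = [peptide_mass("GASPK"), peptide_mass("VTLR")]
    print(rank_database(observed, toy_db))
    print(false_positive_scores(toy_db, n_masses=2, n_trials=5))
```

A real score would be judged significant only if it lies well above the bulk of the scores returned by the random searches, which is the role of the significance-testing step.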

The next level of protein identification is the genera- 
tion of fragment ion spectra from peptides isolated in 
the gas phase of the mass spectrometer (MS/MS spec- 
tra). Matching of fragment ion spectra follows the same 
principle as for peptide-mass mapping. Experimentally
determined masses of fragment ions (together with the
intact mass of the peptide, and often the specificity of 
the enzyme used to generate the peptide) are matched 
with those calculated for isobaric peptides (i.e., same 
mass as experimentally determined) from entries in 
sequence databases. Arthur Moseley (Glaxo Wellcome) 
described how nanoscale capillary LC-MS/MS (where 
peptides are separated chromatographically before MS/MS)
had been automated for identification of gel-separated
proteins. The throughput of this 75 µm ID
capillary system connected to a Qq-TOF mass spectrometer
was 20 samples per day at levels down to 30 fmol (loaded
on gel) for BSA. Moseley continues to develop ultra- 
HPLC (in some cases combined with variable flow 
systems) that improve both the speed and resolution of 
separation. In a Glaxo Organelle Proteomics program, 
various approaches to protein identification were exam- 
ined. A comparison of the total number of proteins 
identified following in situ enzymatic digestion of pro- 
teins separated by either high-resolution 2-DE or one- 
dimensional (usually SDS-PAGE) gel electrophoresis 
(1-DE) was presented. Only one or a limited number of 
proteins are present in each of the 2-DE spots, whereas 
many proteins were present in the 1-DE bands of the 
enriched Golgi complex. In fact, more proteins were 
identified from the 1-DE bands than from the 2-DE 
spots (see below, Analysis of complex protein mixtures 
without gel electrophoresis). 
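
To make the fragment-ion matching principle concrete, the sketch below computes the singly charged b- and y-type fragment ions of a candidate peptide and counts how many observed fragment masses agree with them. It is a minimal illustration under stated assumptions: the b/y nomenclature and masses are standard, but the 0.5 Da tolerance, the simple match count, and the function names are assumptions, and database search programs use considerably more elaborate scoring.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def fragment_ions(peptide):
    """Return (b_ions, y_ions) as lists of singly charged m/z values."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    b_ions, y_ions = [], []
    running = 0.0
    for m in masses[:-1]:            # NH2-terminal fragments b1..b(n-1)
        running += m
        b_ions.append(running + PROTON)
    running = 0.0
    for m in reversed(masses[1:]):   # COOH-terminal fragments y1..y(n-1)
        running += m
        y_ions.append(running + WATER + PROTON)
    return b_ions, y_ions


def matched_fragments(observed, peptide, tol_da=0.5):
    """Count observed fragment m/z values explained by the candidate peptide."""
    b_ions, y_ions = fragment_ions(peptide)
    theoretical = b_ions + y_ions
    return sum(any(abs(obs - t) <= tol_da for t in theoretical)
               for obs in observed)


if __name__ == "__main__":
    b, y = fragment_ions("GAK")
    print("b ions:", [round(x, 4) for x in b])
    print("y ions:", [round(x, 4) for x in y])
    print("matches:", matched_fragments([147.11, 129.07, 500.0], "GAK"))
```

In a search, this comparison would be repeated for every database peptide whose intact mass agrees with the measured precursor mass, and the best-explained candidate reported.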

MS/MS spectra derived from tryptic digestions conducted
in the presence of equal quantities of H₂¹⁶O and
H₂¹⁸O, when combined with subtractive analysis of the
two types of spectra, allow de novo sequencing as
described by Matthias Wilm (EMBL) (18). By utilizing
a Qq-TOF mass spectrometer, peptides containing both
COOH-terminally incorporated stable isotopes and just
the isoform containing the ¹⁸O could be selected for
fragmentation from the mixture. Subtraction of the ¹⁸O
spectrum from the ¹⁶O:¹⁸O spectrum reveals only ¹⁶O
y-series ions. Often, a complete ion series is obtained.
The method has proved feasible in their hands when 
1 pmol of protein is present in the gel (1/4 of this 
amount can be successfully analyzed with standard 
digest conditions). 
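
The reasoning behind the isotope-labeling approach can be sketched as follows. Because the label is incorporated at the COOH terminus, fragments that retain the COOH terminus appear as pairs separated by the mass difference between the two oxygen isotopes (about 2.004 Da), whereas NH2-terminal fragments do not. The code below is a simplified illustration and not the spectrum-subtraction procedure of Ref. 18: it picks out such doublets from a peak list and reads residues from the spacing of consecutive COOH-terminal ions. The peak values, the 0.01 Da tolerance, and the function names are hypothetical.

```python
O18_SHIFT = 2.004245  # mass difference between 18O and 16O (Da)

# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def find_doublets(peaks, tol=0.01):
    """Return the lighter member of every peak pair spaced by the 18O shift."""
    peaks = sorted(peaks)
    return [p for p in peaks
            if any(abs(q - p - O18_SHIFT) <= tol for q in peaks)]


def read_ladder(ions, tol=0.01):
    """Translate spacings between consecutive ions into residue letters.

    Residues are reported in the order the ladder is climbed, that is,
    from the COOH terminus toward the NH2 terminus. Isobaric residues
    are reported together (for example 'L/I'); '?' marks an unexplained gap.
    """
    residues = []
    for low, high in zip(ions, ions[1:]):
        delta = high - low
        hits = [aa for aa, m in RESIDUE_MASS.items() if abs(m - delta) <= tol]
        residues.append("/".join(hits) if hits else "?")
    return residues


if __name__ == "__main__":
    # Hypothetical fragment peaks for a 1:1 labeled tripeptide GAK:
    # two unlabeled NH2-terminal ions and two COOH-terminal doublets.
    peaks = [58.0287, 129.0659, 147.1128, 149.1170, 218.1499, 220.1541]
    labeled = find_doublets(peaks)
    print("doublet ions:", labeled)
    print("residues:", read_ladder(labeled))
```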



Automated identification of gel-separated proteins by 
mass spectrometry. Following quantitative analysis of
2-DE patterns, the next step is the identification of all 
protein spots that display differential expression. An- 
drew Gooley (Proteome Systems) described the ap- 
proaches they are employing for quantitative analysis 
using 2-DE. This included the following: sample prepa- 
ration (sequential detergent extraction with aminosul- 
fobetaine-14), narrow-range immobilized pH gradient 
(IPG) with mini-gels for the 2nd dimension, through to 
the robotic system that they have codeveloped for spot 
excision, liquid handling (peptide extraction and re- 
verse-phase bead cleanup and storage) and peptide- 
mass fingerprinting by MALDI-MS. Apart from the 
throughput of the robotic system, diminished contamination
from keratin and more reproducible spotting of
samples for MALDI-MS are highly desirable features of
automation. Hans-Werner Lahm (Hoffmann-La Roche) 
described the high-throughput system they use for 
automated spot excision from 2-DE, digestion (with 
low-salt buffer to eliminate the need for cleanup), and 
spotting for automated MALDI-MS. Lahm also de- 
scribed the computational aspects of operating such a 
system in high-throughput mode for long periods of 
time, including automated database search routines for 
users distributed throughout the world at other Roche 
sites. They are investigating the use of stable isotope 
labeling (¹⁴N/¹⁵N) followed by mixing of each sample
prior to 2-DE for direct quantitation of relative expres- 
sion differences from the MALDI-MS spectra of indi- 
vidual protein spots. The system averages 1,000 spots 
to spectra per day (including downtime). 

David Arnott (Genentech) described automation of 
in-gel digestions following analysis of differentially 
regulated proteins from 2-DE. Arnott described the 
trapping cartridge approach that was required to ana- 
lyze extracted peptides from the DigestPro robot (cur- 
rently 30 sample spots, but upgradeable to 96) by 
microcapillary LC-MS/MS. They aimed to automate as 
much of the sample processing as possible with auto- 
mated liquid handling from the digestion robot to the 
data-dependent LC-MS/MS (capable of handling 40 
samples per day) using an ion-trap mass spectrometer 
followed by auto-database searching using Sequest (3). 
The system is capable of analysis of subpicomolar 
quantities of protein from silver-stained gels. 

Advances in separations and mass spectrometers. 
Accurate mass analysis of intact proteins using an 
11.5-T FT-ICR coupled with a capillary electrophoresis 
(CE) instrument was demonstrated by Richard Smith 
(Pacific Northwest National Laboratory) as a means of 
proteome analysis. Stable isotope labeling of one sample,
followed by running that sample together with
an unlabeled sample, provides the possibility to measure
protein expression ratios. To identify the proteins
that display different ratios, dissociation in the FT- 
ICR-MS to yield mass tags is possible. Having intact 
mass information as well as identification allows post- 
translational modifications to begin to be investigated. 
The mass accuracy obtainable by this FT-ICR-MS was 
said to be <0.75 ppm, which allows the generation of
accurate mass tags for tryptic peptides. In many cases
this may be sufficient for protein identification (at least 
for an organism like C. elegans). In some cases, MS/MS
may be required, but once performed it would not have 
to be repeated. Another possibility is the identification 
of cysteine-containing peptides at a mass accuracy of 
1 ppm, which was said to be sufficient for identification. 
Another possibility being explored with this instru- 
ment is multiplexed MS/MS, where up to 7 ions could 
be isolated at once and the MS/MS spectrum could be 
deconvoluted for each selected ion (requires accuracy of 
<10 ppm). This will be tried with online separations in 
the near future. 
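
The mass accuracies quoted in this section are relative errors in parts per million. As a minimal sketch (the function names and example masses are illustrative assumptions), the ppm error of a measurement, the absolute window implied by a ppm tolerance, and the set of candidate masses consistent with a measurement can be computed as follows.

```python
def ppm_error(measured, theoretical):
    """Relative mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


def mass_window(theoretical, tol_ppm):
    """Absolute (low, high) mass window implied by a ppm tolerance."""
    half_width = theoretical * tol_ppm / 1e6
    return theoretical - half_width, theoretical + half_width


def candidates_within(measured, candidate_masses, tol_ppm):
    """Candidate masses that agree with the measurement within tol_ppm."""
    return [m for m in candidate_masses
            if abs(ppm_error(measured, m)) <= tol_ppm]


if __name__ == "__main__":
    low, high = mass_window(1500.0, 1.0)
    print(f"1 ppm window at 1500 Da: {low:.4f} to {high:.4f}")
    print(candidates_within(1000.0005, [1000.0, 1000.02], tol_ppm=1.0))
```

At 1 ppm, a 1,500 Da peptide must match to within about 0.0015 Da, which illustrates why so few database peptides remain as candidates at this accuracy.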

Marvin Vestal (PE BioSystems) described his continu- 
ing efforts in MALDI-MS instrument design. The at- 
tributes he is aiming for include sensitivity, specificity 
(resolution, mass accuracy, selective ionization), speed,
accuracy, dynamic range, and mass range. The sensitiv- 
ity will always be limited by chemical noise, but the aim 
is to reduce the limitations of ionization and data 
handling. Vestal would like to achieve a sensitivity of 
1 fmol with a data acquisition rate of 1 spectrum/s. The 
instrument he is designing to achieve these aims is a 
MALDI-TOF/TOF-MS. This system has an ion gate 
(with 500 resolution and no loss of sensitivity) after the 
collision cell so that metastable ions created after 
reacceleration are removed. Although this system is in 
the early stages of development, the data shown demon- 
strate that this instrument is meeting most of the 
stated objectives. 

The hybrid quadrupole TOF (Qq-TOF) mass spec- 
trometer developed a few years ago (9), which has now 
been commercialized, utilizes an ESI source for ionization
(10). Both Ken Standing (Univ. of Manitoba) and Brian 
Chait (Rockefeller Univ.) described the use of a MALDI 
ion source for introduction of ions into a modified commer- 
cial Qq-TOF, thus taking advantage of both the high- 
efficiency ion production of the MALDI and the ion isolation/ 
fragmentation of the quadrupole system with a TOF mass 
analyzer. Standing presented data showing sensitivity of 
purified standards (e.g., Substance P) in the 70 amol range 
(1-min acquisition) for MS and 7 fmol for MS/MS with 
10,000 resolution. This instrument offers similar advantages
to the TOF/TOF described above.

Online MS analysis of capillary electrophoretic or 
chromatographic separations of peptides (or proteins) is
most often achieved using ESI-MS. Barry Karger (Bar- 
nett Institute) described how very small quantities of 
peptides/proteins could be separated and analyzed 
using vacuum deposition onto Mylar audio tape for 
subsequent coupled MALDI-MS analysis. The ap- 
proach had so far been multiplexed with the effluent of 
12 capillaries being deposited under vacuum onto the 
tape. The approach is designed for high-throughput 
separations and mass analysis. 

Proteomic analyses often employ 2-DE, but David 
Lubman (Univ. of Michigan) described a liquid-phase 
2-D separation of proteins utilizing a novel MS. The 
requirements of his mass spectrometer were high sensi- 
tivity, low duty cycle, and fast response. He designed 
and built an ion trap to capture ions from the CE
coupled to TOF-MS. The 2-D liquid-phase separation
consisted of nonporous silica bead RP-HPLC (which 
provided good resolution <50 kDa) that was conducted 
at high pH followed by CE and MS. Whole cell lysates 
were analyzed with this system, and some of the data 
obtained were presented. 

Biological applications. Brian Chait (Rockefeller 
Univ.) presented the culmination of an enormous 
amount of work at both the protein chemistry (mass 
spectrometry) and cell biology levels. The nuclear pore 
complex (NPC) in yeast is a massive structure (1,000 Å
across with 8-fold symmetry) that regulates protein 
transport in/out of the nucleus. The first step in under- 
standing this structure was to purify the complex and 
then identify every protein present. The protein frac- 
tion was separated by several different chromato- 
graphic steps followed by SDS-PAGE from which every 
visible band was excised and analyzed by MALDI-IT- 
MS. This was an especially daunting task as the NPC 
when isolated contains a snapshot of the proteins 
transiting the NPC at that point in time. Hence, of the 
174 proteins identified, 29 were nucleoporins and only 
14 were shown to be present in the NPC. These 14 
proteins were characterized as being present in the 
NPC by a variety of techniques. Protein A (4.5 repeats 
of the Fc binding region) fusions with the proteins of 
interest were generated, and immunohistochemistry 
was performed on cells transfected with these con- 
structs. Electron microscopy of hundreds of NPCs 
following transfection allowed stoichiometry and sym- 
metry (nuclear/cytoplasmic/asymmetric) to be deter- 
mined. Subcellular fractionation and high-pH extrac- 
tions were also performed to further characterize 
localization biochemically. This elegant study has allowed
a testable model for nuclear transport to be
constructed. 

Two examples of the utility of analysis of unfraction- 
ated or partially fractionated complex protein mixture 
digests (see next section) were presented by Scott 
Patterson (Amgen Inc.). As a first step in understanding
the interchromatin granule clusters (IGC),
a nuclear organelle that is a major site of mRNA
splicing, samples enriched in this structure were digested
with trypsin, and the complex mixture of peptides
was analyzed by data-dependent LC-MS/MS (8).
Some proteins known to be present in these structures 
were identified together with 19 novel genes (including 
ESTs). Three of the genes were confirmed to be present 
in the IGC by immunohistochemistry of cells transfected
with yellow fluorescent protein (YFP)-fusion
constructs with counterstaining of splicing factors. The
other study presented identified 108 proteins present 
in a protein fraction obtained from isolated mitochon- 
dria treated with atractyloside [mimicking in vitro the 
permeability transition pore complex (PTPC) which 
occurs during apoptosis] (13). 

Analysis of immunoprecipitates using a new affinity 
strategy was presented by Gitte Neubauer (EMBL). 
The new strategy is referred to as tandem affinity 
purification (TAP) and was developed by colleagues at 
EMBL (15). The system utilizes a double tag for higher
specificity and much reduced background. Preparations of the human
spliceosome immunoprecipitated under normal conditions
(see Ref. 5 for the same approach with yeast tri-snRNP)
and using the TAP method were compared,
demonstrating the utility of this approach. 

The common theme of all of these applications is that 
MS was utilized early on to provide rapid and accurate 
protein identifications. The genes identified could then 
be further analyzed to attempt to determine their 
function. 

The use of MS to identify proteins from 2-DE gels was 
also described by Al Burlingame (Univ. of California,
San Francisco) and Reid Townsend (Oxford GlycoSci- 
ences). Burlingame described their work to identify 
protein targets of acetaminophen during acute toxicity 
and the intricacies of such analyses (14). Townsend
described an Oxford GlycoSciences and Pfizer collabora- 
tion to separate by 2-DE and identify proteins from 
cerebrospinal fluid (CSF) in a study aimed at identify- 
ing markers for Alzheimer's disease. CSF is a compart- 
ment isolated by the blood-brain barrier but it is not 
just a filtrate of blood. It is produced by the choroid 
plexus and has a total central nervous system volume 
of about 90-150 ml that is turned over a few times per 
day. Comparative analysis of matched plasma and CSF
samples (in addition to the normal/diseased samples)
revealed that key plasma proteins (e.g., albumin, trans- 
ferrin, IgG) showed markedly different relative ratios 
between plasma and CSF. For effective 2-DE analysis of 
these samples, a selective removal of albumin, IgG, 
transferrin, and haptoglobin had to be developed. This 
was accomplished by affinity depletion. Interestingly, 
many features in a 2-DE separation are albumin frag- 
ments (in fact, 4% of total features). Their study 
included 512 samples from 228 patients and resulted in 
1,131 features (spots) being annotated. Potential mark- 
ers of Alzheimer's disease were said to be identified. 

Separate from the MS identification issues covered in 
most of the meeting, Kerstin Strupat (Univ. of Muen- 
ster) presented her work on MS analysis of noncovalent 
complexes. Here the challenge is to transfer noncova- 
lent interactions that occur in the condensed phase to 
the gas phase. ESI-MS has been shown by a number of 
groups to work, but MALDI-MS analysis has proved 
more difficult. Examples of MALDI-MS analysis of 
noncovalent protein:protein (streptavidin tetramer and
the macrophage migration inhibitory factor related
proteins MRP-8 and MRP-14) and protein:ligand (aldose
reductase:NADP) interactions were presented.
Interestingly, analysis of the first laser pulse during a 
MALDI-MS analysis often allows investigation of non- 
covalent interactions that are not observed during 
subsequent pulses (16). 

Analysis of complex protein mixtures without gel 
electrophoresis. The first stage of many proteome 
projects is the identification of the components compris- 
ing the system under study. This is of course the first 
step in understanding any biological system. As de- 
scribed above, an increasing (but still limited) number 
of laboratories have access to robotic systems requisite 
for the analysis of large numbers of spots from 2-DE.
However, a trend in the field is emerging toward the
elimination of the high-resolution protein separation 
step prior to protein identification by MS. In this 
approach, the entire enriched protein fraction is enzy- 
matically digested (usually with trypsin), and the result- 
ing complex peptide mixture is subjected to data- 
dependent LC-MS/MS. In this approach the peptides 
are separated by both hydrophobicity (RP-HPLC) and 
mass-to-charge ratio (m/z, in the mass spectrometer) prior to ion
selection by the MS control software (hence, data 
dependent). At this meeting, presentations from five 
groups demonstrated the utility of the approach to 
identify components of complex mixtures. 

Analysis of immunoprecipitated proteins or enriched 
protein fractions (e.g., Golgi complex) by either gel 
electrophoresis followed by in-gel digestion and MS or 
digestion of the entire protein fraction and analysis by 
data-dependent LC-MS/MS using a Qq-TOF was described
by Jyoti Choudhary (Glaxo Wellcome). Batched
MS/MS spectra were searched using the Mascot pro- 
gram (http://www.matrixscience.com). Data pre- 
sented showed that if the immunoprecipitate was clean, 
then direct digestion of the mixture proved slightly 
more successful than analysis of gel-separated pro- 
teins. When an enriched Golgi complex from rat liver 
was separated by either 2-DE (135 spots) or 1-DE (77 
bands) and in-gel digested followed by LC-MS/MS, 
significantly more proteins were identified from the 
1-DE separation. 

David Arnott (Genentech) described the proteomics 
component of Genentech's Secreted Protein Discovery
Initiative, which also includes genomic, signal trap, 
expression, and functional analysis. Arnott evaluated 
three methods to identify proteins secreted from human
umbilical microvascular endothelial cells
(HUMECs) into 60 ml of serum-free media: 2-DE and
1-DE (with/without staining) followed by in-gel diges- 
tion, and direct digestion of the entire protein mixture. 
Digests were analyzed using the microcapillary system 
described above. Interestingly, direct digestion followed 
by data-dependent LC-MS/MS identified the most pro- 
teins, but all three methods were complementary in 
their hands (21 proteins identified by all three methods 
but no completely novel gene products). 

Analysis of serum fractionated using the Cohn pH/ 
ethanol precipitation protocol followed by digestion of 
the entire fraction prior to data-dependent LC-MS/MS 
was described by Karl Clauser (Millennium Pharmaceu- 
ticals) in the context of the studies of differences 
between wild type and ApoE mice. Clauser also 
presented the bioinformatics flow for data handling, 
which utilizes a variant of the publicly available MS- 
Tag (http://prospector.ucsf.edu/) for protein identifi- 
cation and a de novo sequence interpretation program 
referred to as SHERENGA (2). Their stated aim is for 
searching to keep up with the LC-MS/MS. They have 
also been experimenting with the IEX ion-exchange 
protocol developed by Andy Link (7) as a means of 
decreasing the complexity of the sample and reducing 
the number of singly charged and highly charged ions 
as these are less likely to be identified. In one IEX
fraction, 87 plasma proteins were identified in a single
run compared with 66 from an unfractionated sample.

Scott Patterson (Amgen) described Amgen's proteom- 
ics efforts, now in the third year. They are employing 
data-dependent LC-MS/MS of complex protein mixture 
digests. The stated aim is to reduce the complexity such 
that in an ideal situation only one peptide for each 
protein in the mixture is fragmented during LC-MS/ 
MS. To achieve this aim, various affinity methods can 
be employed, and the use of cysteinyl peptide capture 
using either thiopropyl Sepharose or a biotin alkylating 
reagent, N-[6-(biotinamido)hexyl]-3'-(2'-pyridyldithio)propionamide 
(HPDP-biotin), was described (13). The 
former was used in a large-scale analysis of urinary 
proteins in which digests of the unfractionated starting 
material, the albumin/IgG-depleted material, and the 
cysteinyl peptide-captured and noncaptured fractions 
were analyzed. The 
samples were analyzed with replicate LC-MS/MS runs 
using narrow mass ranges for ion selection for each 
run, thereby increasing the number of unique spectra 
selected for fragmentation. This analysis resulted in 
the identification of over 200 proteins, including a 
number of uncharacterized nucleotide sequences (e.g., 
ESTs). Smaller scale analyses are described above, in 
one case [soluble intermembrane proteins (SIMPs)] 
utilizing cysteinyl peptide capture to identify more 
proteins than with no fractionation. Data handling for 
this high-throughput effort was also briefly described. A 
number of the fractions being analyzed have some of 
the same components; therefore, to enhance the identi- 
fication process, spectral matching of the database (>5 
million spectra) is performed. This links identical spec- 
tra and therefore reduces the redundancy associated 
with re-searching already identified spectra. 
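
The report does not say how the spectral matching was implemented. As a minimal sketch of the general idea, assuming a simple binned cosine similarity and a greedy grouping (both choices, and the names binned, cosine, and link_spectra, are illustrative assumptions rather than the method presented), identical spectra could be linked like this: 

```python
import math
from collections import defaultdict


def binned(peaks, bin_width=1.0):
    """Sum peak intensities into fixed-width m/z bins (bin width is an assumption)."""
    vec = defaultdict(float)
    for mz, intensity in peaks:
        vec[int(mz / bin_width)] += intensity
    return vec


def cosine(a, b):
    """Cosine similarity between two sparse binned spectra."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def link_spectra(spectra, precursor_tol=1.0, min_cosine=0.9):
    """Greedily group spectra that look identical.

    spectra: iterable of (spectrum_id, precursor_mz, [(mz, intensity), ...]).
    A spectrum joins the first group whose representative has a precursor
    m/z within precursor_tol and a cosine similarity of at least min_cosine;
    otherwise it starts a new group. Both thresholds are hypothetical.
    """
    clusters = []  # each entry: (representative precursor, representative vector, member ids)
    for spec_id, precursor, peaks in spectra:
        vec = binned(peaks)
        for rep_precursor, rep_vec, members in clusters:
            if (abs(precursor - rep_precursor) <= precursor_tol
                    and cosine(vec, rep_vec) >= min_cosine):
                members.append(spec_id)
                break
        else:
            clusters.append((precursor, vec, [spec_id]))
    return [members for _, _, members in clusters]
```

Only one spectrum per group would then need to be searched against the sequence database, with the result propagated to the other members. 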

Quantitative analysis of two samples without electro- 
phoresis. MALDI-MS, using the surface enhanced laser 
desorption ionization (SELDI) surface, to search for 
disease markers in biological fluids was presented by 
Scot Weinberger (Ciphergen Biosystems). In this approach, 
defined chemical/biochemical surfaces are utilized 
to allow fractionation of proteins from biological 
fluids in a reproducible manner. This reproducibility 
allows comparisons between different samples to be 
made. Weinberger described the search for markers of 
benign prostatic hyperplasia that, like prostate cancer, 
displays elevated prostate specific antigen (PSA) levels. 
The fraction exhibiting a difference between these 
samples could be enzymatically digested to generate a 
number of peptides, which were then fragmented using 
the MALDI Qq-TOF of Standing, described above. There 
appears to be a difference in the relative level of a 
semenogelin fragment between these two diseases, 
providing a potential differential marker. The method is 
sensitive but apparently 
limited to analysis of proteins less than about 20 kDa (a 
range not well characterized by 2-DE). 

A combination gel/MS approach referred to as a 
"virtual 2-D gel" was presented by Phil Andrews (Univ. 
of Michigan). In this approach, proteins are separated 
by charge using thin-layer isoelectric focusing 
(IEF), and this gel is then subjected to MALDI-MS. By 
rastering through the entire IEF gel, a composite 
display of all acquired MALDI-MS spectra can be 
generated (hence, the virtual 2-DE). Such analyses 
would provide very accurate mass measurements, 
greatly assisting in posttranslational modification 
analyses and potentially in quantitation. 
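
How the composite display was assembled was not detailed. One plausible sketch, assuming each raster position along the IEF gel yields one peak list and that intensities are simply summed into fixed-width m/z bins (the function name, bin width, and data layout are all hypothetical), is: 

```python
def build_virtual_gel(scans, mz_min, mz_max, bin_width):
    """Assemble a position-by-mass intensity image from rastered spectra.

    scans: list of (position_mm, [(mz, intensity), ...]), one entry per
    laser position along the IEF gel (layout assumed for illustration).
    Returns a list of rows ordered by gel position; each row holds the
    summed intensity per m/z bin, so rows map to pI and columns to mass.
    """
    n_bins = int((mz_max - mz_min) / bin_width) + 1
    image = []
    for _position, peaks in sorted(scans, key=lambda scan: scan[0]):
        row = [0.0] * n_bins
        for mz, intensity in peaks:
            if mz_min <= mz <= mz_max:
                row[int((mz - mz_min) / bin_width)] += intensity
        image.append(row)
    return image
```

Plotting the returned rows as an image would give the gel-like view, with the mass axis supplied by the spectrometer rather than by SDS-PAGE migration. 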

Karl Clauser (Millennium Pharmaceuticals) de- 
scribed their efforts at utilizing already existing LC- 
MS/MS data to attempt to gain some quantitative/ 
qualitative information as to differences between 
samples. Differences in serum protein levels between 
wild-type and ApoE -/- mice have been examined 
using this approach, which compares the MS ion cur- 
rent from peptides identified between LC-MS/MS runs 
of each sample. Comparison between runs is a difficult 
task, but the data suggested that a difference in a 
component can be stated with sufficient confidence 
when it differs by a factor of 3 between the 
samples. 
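
As an illustration of this kind of comparison, the sketch below flags peptides whose MS ion current differs by at least threefold between two runs. It assumes the ion currents have already been extracted and normalized, reads the factor of 3 as a minimum fold change in either direction, and uses hypothetical names throughout; it is not the Millennium pipeline. 

```python
def flag_differences(run_a, run_b, min_fold=3.0):
    """Return {peptide: ratio} for peptides differing by >= min_fold.

    run_a, run_b: dicts mapping an identified peptide to its MS ion
    current in each LC-MS/MS run (assumed already normalized).
    The ratio is run_b / run_a; peptides seen in only one run, or with
    a non-positive ion current, are skipped rather than guessed at.
    """
    flagged = {}
    for peptide in run_a.keys() & run_b.keys():
        a, b = run_a[peptide], run_b[peptide]
        if a <= 0 or b <= 0:
            continue
        ratio = b / a
        if ratio >= min_fold or ratio <= 1.0 / min_fold:
            flagged[peptide] = ratio
    return flagged
```

For example, a peptide at 1.0e6 counts in one run and 4.0e6 in the other would be flagged, whereas one at 2.0e6 and 2.5e6 would not. 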

An LC-MS/MS-based system was described by Steve 
Gygi (Univ. of Washington) for quantitative analysis of 
complex mixtures. The technology is referred to as 
isotope-coded affinity tag (ICAT) (6). The ICAT reagent 
described here is composed of three units: an affinity 
reagent (biotin), a linker region (one of two forms), and 
a reactive group (a thiol-specific reagent, iodoacetic 
acid). The linker region encodes the mass difference, 
with the light version having 8 hydrogens and the 
heavy version having 8 deuteriums. Thus the mass 
difference is 8 mass units (doubly charged ions will 
have an m/z difference of 4). Following reduction and 
alkylation of each of the two protein samples with one 
of the two reagents, the two samples can then be mixed 
together. All subsequent manipulations are performed 
as a mixture, culminating in tryptic digestion of the 
complex sample and capture of the cysteinyl peptides 
on avidin. The bound peptides are released and ana- 
lyzed by LC-MS/MS, revealing paired signals of pep- 
tides. Calculation of areas under the peak for each 
paired ion from the LC-MS data provides an accurate 
record of the relative quantities of the proteins from 
each starting sample. The MS/MS spectra allow identi- 
fication of the peptides. The approach was elegantly 
demonstrated with yeast grown on either galactose- 
containing media or ethanol-containing media. Pro- 
teins expected to be differentially regulated were ob- 
served, and, highlighting the advantages of analysis at 
the protein level as opposed to the mRNA level (e.g., 
microarray), alcohol dehydrogenase-1 (ADH1) was found 
to be oppositely regulated (as expected) to ADH2, to 
which it is 95% homologous. This is a very promising 
approach for quantitative analysis of complex protein 
mixtures. 
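
The arithmetic behind the paired signals can be sketched as follows. The code uses the nominal 8-unit light/heavy difference stated above, assumes one tag per peptide, and takes already-integrated peak areas as input; the function names and the matching tolerance are hypothetical. 

```python
ICAT_MASS_SHIFT = 8.0  # nominal heavy-minus-light mass difference per tag, as stated in the text


def pair_spacing(charge):
    """Expected m/z spacing between light and heavy forms at a given charge."""
    return ICAT_MASS_SHIFT / charge


def find_pairs(peaks, charge, tol=0.02):
    """Find light/heavy pairs and their relative abundance.

    peaks: list of (mz, area) from the LC-MS data, where area is the
    integrated area under the peak for that ion.
    Returns a list of (light_mz, heavy_mz, heavy_area / light_area).
    """
    spacing = pair_spacing(charge)
    pairs = []
    for light_mz, light_area in peaks:
        for heavy_mz, heavy_area in peaks:
            if light_area > 0 and abs((heavy_mz - light_mz) - spacing) <= tol:
                pairs.append((light_mz, heavy_mz, heavy_area / light_area))
    return pairs
```

With doubly charged ions at m/z 600.30 (area 2000) and 604.30 (area 1000), the spacing of 4 identifies the pair, and the area ratio of 0.5 suggests the protein is half as abundant in the heavy-labeled sample. 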

A number of interesting posters were also presented 
at the meeting, and some of the presenters were given 
the opportunity to "advertise" their posters. These dealt 
with the same range of subjects presented in the oral 
sessions. 

Conclusion. The organizers Ruedi Aebersold and 
John Stults brought together an excellent program for 
this meeting, with essentially all major laboratories in 
this field being represented. The field has grown enormously 
over the past few years, and advancements 
presented at this meeting support an optimistic view of 
the future of proteomics. This very successful meeting 
provided the 162 attendees with the state of the art in 
mass spectrometry and proteomics. 

Address for reprint requests and other correspondence: S. D. 
Patterson, Biochemistry, Amgen Inc., One Amgen Center Drive, MS 
14-2-E, Thousand Oaks, CA 91320-1789 (E-mail: spatters@amgen.com). 